|
The Oxford English Corpus is a text corpus of 21st-century English, used by the makers of the ''Oxford English Dictionary'' and by Oxford University Press's language research programme. It is the largest corpus of its kind, containing nearly 2.5 billion words. It includes language from the UK, the United States, Ireland, Australia, New Zealand, the Caribbean, Canada, India, Singapore and South Africa. The text is mainly collected from web pages; some printed texts, such as academic journals, have been collected to supplement particular subject areas. The sources are writings of all sorts, from "literary novels and specialist journals to everyday newspapers and magazines and from Hansard to the language of blogs, emails, and social media". This may be contrasted with similar databases that sample only a specific kind of writing.

The corpus is generally available only to researchers at Oxford University Press, but other researchers who can demonstrate a strong need may apply for access.

The digital version of the Oxford English Corpus is formatted in XML and usually analysed with Sketch Engine software. (Technical information. Retrieved February 4, 2014.)

Each document in the OE Corpus is accompanied by metadata naming:
* title
* author (if known; many websites make this difficult to determine reliably)
* author gender (if known)
* language type (e.g. British English, American English)
* source website
* year (+ date, if known)
* date of collection
* domain + subdomain
* document statistics (number of tokens, sentences, etc.)

==See also==
* British National Corpus
* Corpus of Contemporary American English (COCA)
* American National Corpus
* Frequency analysis
|